Abstract

The proposed study investigates a novel neuro-dynamic model that can learn to
predict or regenerate fluctuating sequence patterns by extracting latent statistical
structures in the patterns. The novelty of the model is that the fluctuating sequences are
learned by adequately incorporating stochastic dynamics and deterministic chaos
self-organized in the network. The model is expected to bring the following advantages:
(1) an adequate mixture of stochastic and deterministic dynamics can increase the
representational power of the model, (2) no arbitrary manipulation or interpretation
of the data by humans is needed, and (3) the model can be scaled by
incorporating the scheme of multiple timescale dynamics to extract temporal
hierarchy from the data. The potential impact of applying the model to sensory-motor
sequence learning by robots, as well as to video image understanding through accumulated
learning of exemplars, is discussed.

1.  Introduction 

The capability of learning to predict perceptual streams or encountered events by
acquiring internal models is indispensable for intelligent or cognitive systems, because
various cognitive functions are based on this competency, including goal-directed
planning, mental simulation and recognition of the current situation. Learning to
predict is a difficult task because the time development of physical systems is often
observed as noisy or fluctuating. In such situations, the phenomena are assumed to be
probabilistic and model estimation based on a probabilistic model is performed. By
partitioning the system’s state space into a finite set of discrete states with labels,
probabilistic state transition models for the system dynamics can be acquired by
counting the events of each state transition, as in the Hidden Markov Modeling
scheme. However, it is not trivial to determine whether the observed fluctuating time
series data are truly generated by some statistical mechanism, especially when the
amount of observed data is not sufficient. This is because the observed phenomena might
be just “pseudo stochastic”, meaning that they are actually generated deterministically
through the initial sensitivity characteristics (sensitivity to initial conditions) of
chaos arising in the original continuous state space. Such examples can be seen
in the studies of deterministic neuro-dynamic models (Tani & Fukumura, 1994;
Nishimoto & Tani, 2005; Namikawa et al., 2011).

Although there has been a dichotomy between deterministic modeling and
non-deterministic modeling, which allows probability, in accounting for complex
phenomena, this dichotomy may not be essential when biological brains or artificial
cognitive agents attempt to develop internal models of the world via accumulated direct
observation or perceptual experience. If deterministic chaos or a particular statistical
mechanism is necessary to model a set of observed phenomena, either mechanism could be
self-organized in the course of developing the model rather than given a priori. The
mechanism self-organized via accumulated learning could turn out to be a merging of
deterministic chaos and stochastic dynamics rather than only one of them.

The primary motivation of the research is to examine how these two
mechanisms can be combined in developing effective models that account for observed
temporal phenomena. This research could lead to (1) a new theory for
handling fluctuating data that goes beyond traditional statistical theory, which assumes
the law of large numbers, and (2) a novel but much simpler computational scheme
that can learn to predict as well as recognize observed fluctuating data in the
continuous space and time domain by utilizing the self-organization mechanisms of
neuro-dynamic systems.

The current study utilizes a dynamic neural network model, the
stochastic continuous time recurrent neural network (S-CTRNN) model (Murata et al.,
2013), which was developed previously in our laboratory. The current research
investigates the aforementioned problems by extending this model. The next section
introduces the basic mechanism of the S-CTRNN model and explains how it can be extended
for application to the current problem.

2.  The  stochastic  CTRNN  model  and  its  extension

Model

Here, we describe the basic mechanism of the S-CTRNN model (Murata et al., 2013), which
can learn to extract probabilistic structures latent in a set of exemplar sequence
patterns with particular fluctuations. This model is built on the conventional CTRNN
model, which is characterized by its so-called context loop consisting of context input
units $c_t$ and context output units $c_{t+1}$. The context output units employ the
dynamics of a leaky integrator neuron with a decay rate of $(1 - 1/\tau)$, where $\tau$
is the time constant. The S-CTRNN is characterized by its capability of predicting
subsequent inputs not only with their means but also with their variances (see Fig. 1).


Fig. 1  S-CTRNN model


This implies that if some parts of the input sequences fluctuate more than other
parts, the time-dependent variances in these periods become larger. On the other hand, if
some parts fluctuate less, their variances become smaller. It can be said that the
S-CTRNN can predict its own predictability for each dimension of the input sequences in
a time-dependent manner. The trained network model can reconstruct the target
fluctuating sequences in terms of stochastic dynamics by adding noise with the estimated
time-varying variance to the predicted mean of the output at each step. For the purpose
of learning to predict both the mean and the variance of each dimension of the target
sequences, the following likelihood function $L_{out}$ is maximized.


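$$L_{out} = \prod_{r} \prod_{t} \prod_{i} \frac{1}{\sqrt{2\pi v_{r,t,i}}} \exp\left(-\frac{(o_{r,t,i} - d_{r,t,i})^{2}}{2 v_{r,t,i}}\right) \qquad \text{(Eq. 1)}$$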

where $o_{r,t,i}$ is the $i$th dimension of the prediction output at time step $t$ in the
$r$th sequence, $d_{r,t,i}$ is its teaching target, and $v_{r,t,i}$ is its predicted
variance. Maximizing Eq. 1 amounts to minimizing the squared error divided by the
estimated variance at each step (the normalization factor additionally penalizes large
variance estimates). This means that the prediction error at a particular time step is
pressed toward zero more strongly when its variance is estimated to be smaller, and less
strongly when its variance is estimated to be larger. In the course of iterative learning
with a set of target sequences, this likelihood function is maximized by optimizing the
connectivity weights and the initial context states estimated for all corresponding
target sequences. After iterative learning for maximizing $L_{out}$, the target sequences
can be reconstructed by means of stochastic dynamics parameterized by the time-varying
variance estimated by the model.
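The variance weighting can be made explicit by taking the logarithm of Eq. 1, which gives
$\ln L_{out} = -\sum_{r,t,i} \left[ \tfrac{1}{2}\ln(2\pi v_{r,t,i}) + (o_{r,t,i} - d_{r,t,i})^{2} / (2 v_{r,t,i}) \right]$.
The following is a minimal NumPy sketch of this quantity and of its gradient with respect
to the predicted mean. The function names and the numerical values are illustrative
assumptions and are not taken from the original implementation.

```python
import numpy as np

def neg_log_likelihood(o, d, v):
    """Negative logarithm of the Gaussian likelihood in Eq. 1.

    o: predicted means, d: teaching targets, v: predicted variances (> 0).
    All arrays share one shape, e.g. (sequences, steps, dimensions).
    """
    o, d, v = (np.asarray(a, dtype=float) for a in (o, d, v))
    return float(np.sum(0.5 * np.log(2.0 * np.pi * v) + (o - d) ** 2 / (2.0 * v)))

def grad_wrt_mean(o, d, v):
    """Gradient of the negative log-likelihood with respect to the mean o."""
    o, d, v = (np.asarray(a, dtype=float) for a in (o, d, v))
    return (o - d) / v

# The same prediction error of 0.1 produces a 100 times larger gradient
# when the variance estimate is 0.01 instead of 1.0.
g_small_v = grad_wrt_mean([0.5], [0.6], [0.01])
g_large_v = grad_wrt_mean([0.5], [0.6], [1.0])
```

For a fixed error $e = o - d$, setting the derivative of $\tfrac{1}{2}\ln v + e^{2}/(2v)$
with respect to $v$ to zero gives $v = e^{2}$, so the variance output is driven toward the
typical squared error at that step.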

Now, we explain an extension of the S-CTRNN model for the purpose of
investigating the dichotomy between the stochastic model and the deterministic
dynamics model. It is well known that deterministic chaos develops by utilizing the
initial sensitivity characteristics of nonlinear dynamic systems. Therefore, we
hypothesize that a particular control of the initial sensitivity in the network dynamics
during the learning process could manipulate the development of chaotic dynamic
structures in the network. Our intuition is that if the initial sensitivity is positively
utilized by allowing large variability in the distribution of the initial context states
determined in the course of learning, fluctuations in the target sequences could be
represented by developing deterministic chaos while minimizing the estimates of the
output variances. Otherwise, the fluctuations could be reconstructed as driven by a noise
term whose variance is estimated with a relatively large value. Following this idea,
an additional likelihood function $L_{init}$ is considered, which controls the
distribution of the initial context unit states determined for the set of target sequences.


$$L_{init} = \prod_{r} \prod_{i} \frac{1}{\sqrt{2\pi \sigma_{IS}^{2}}} \exp\left(-\frac{(u_{r,0,i} - \hat{u}_{i})^{2}}{2 \sigma_{IS}^{2}}\right) \qquad \text{(Eq. 2)}$$


where $\sigma_{IS}^{2}$ is the predefined variance that confines the variability of the
set of initial context unit states for all teaching target sequences, $\hat{u}_{i}$ is the
optimized mean of the $i$th dimension of the internal values of the initial context unit
states over all sequences, and $u_{r,0,i}$ is the $i$th dimension of the initial state for
the $r$th sequence. Eq. 2 puts a specific probabilistic distribution constraint on
determining the optimal initial context states for all sequences through the parameter
$\sigma_{IS}^{2}$. If $\sigma_{IS}^{2}$ is set to a large value, the distribution of the
initial context states becomes wide. Otherwise, it becomes tight. In the proposed extended
model, the likelihood function $\ln L_{all} = \ln L_{out} + \ln L_{init}$ is maximized.
By maximizing the likelihood $L_{all}$ during the learning process, the optimal
connectivity weights common to all target sequences, the initial state for each target
sequence and the estimates of the time-dependent variance for each sequence are
obtained depending on the parameter $\sigma_{IS}^{2}$.
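The effect of the predefined variance on the initial states can be illustrated
numerically. The sketch below assumes that $L_{init}$ has a Gaussian form with a shared
per-unit mean, as the description above suggests. The function name `log_l_init`, the
argument name `sigma2_is` and the example states are hypothetical. The mean is taken as the
per-unit average over sequences, which is the value that maximizes this term for fixed
initial states.

```python
import numpy as np

def log_l_init(u0, sigma2_is):
    """Log-likelihood of initial context states under a shared Gaussian.

    u0: array (sequences, context_units) of initial internal states.
    sigma2_is: predefined variance of the initial-state distribution.
    """
    u0 = np.asarray(u0, dtype=float)
    mean = u0.mean(axis=0, keepdims=True)        # per-unit mean over sequences
    sq = (u0 - mean) ** 2
    return float(np.sum(-0.5 * np.log(2.0 * np.pi * sigma2_is) - sq / (2.0 * sigma2_is)))

u0_spread = np.array([[-0.5, 0.2], [0.5, -0.2]])   # two sequences, two context units
u0_collapsed = np.zeros((2, 2))                    # identical initial states

# Cost of spreading the initial states apart, for the two settings used in the text.
penalty_narrow = log_l_init(u0_collapsed, 0.001) - log_l_init(u0_spread, 0.001)
penalty_wide = log_l_init(u0_collapsed, 1.0) - log_l_init(u0_spread, 1.0)
```

With these example values the same spread costs 290 log-likelihood units under the narrow
setting and 0.29 under the wide setting, so only the wide setting leaves room for the
initial states of different sequences to move apart during learning.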


Learning  to  reconstruct  stochastic  FSM 

In the following simulation experiment, we test the model on an example of learning
to reconstruct a particular stochastic finite state machine (S-FSM) from exemplar
sequences (see Fig. 2).


Fig. 2  Learning to reconstruct a target S-FSM by an S-CTRNN model (example S-FSM output
sequence shown in the figure: 010011011010011010010....)

The target S-FSM generates the deterministic sequence “0, 1” in the state transitions from
state 1 to state 3, whereas it generates 0 or 1 with equal probability in the state
transition from state 3 to state 1. An S-CTRNN with 10 context units, 12 hidden
units and two output units, one for prediction of the mean and the other for
estimation of the variance, is used to learn target sequences generated by the target
S-FSM. The time constant $\tau$ is set to 1.0. This means that the CTRNN used in the
experiment reduces to an RNN with discrete time operation. The target sequences
consist of 10 sequences, each 25 steps long. The same target sequences are learned
under two different learning conditions: the narrow initial state distribution with
$\sigma_{IS}^{2}$ set to 0.001 (Narrow IS) and the wide initial state distribution with
$\sigma_{IS}^{2}$ set to 1.0 (Wide IS).
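A minimal sketch of how such target sequences could be generated is given below. It
assumes that every sequence starts in state 1. The function name `generate_sfsm` and the
random seed are illustrative choices, not details from the original experiment.

```python
import numpy as np

def generate_sfsm(n_steps, rng):
    """Emit symbols from the three-state S-FSM described above.

    State 1 -> 2 emits 0, state 2 -> 3 emits 1, and state 3 -> 1 emits
    0 or 1 with equal probability. Assumes the machine starts in state 1.
    """
    out = np.empty(n_steps, dtype=int)
    for t in range(n_steps):
        phase = t % 3
        if phase == 0:
            out[t] = 0
        elif phase == 1:
            out[t] = 1
        else:
            out[t] = rng.integers(0, 2)      # fair choice between 0 and 1
    return out

rng = np.random.default_rng(0)               # seed chosen arbitrarily
targets = np.stack([generate_sfsm(25, rng) for _ in range(10)])  # 10 sequences, 25 steps

branch = targets[:, 2::3]                    # symbols emitted at the stochastic transition
branch_mean, branch_var = branch.mean(), branch.var()
```

A fair binary choice has mean 0.5 and variance 0.25, so with enough samples `branch_mean`
and `branch_var` approach these values, while the two deterministic phases have zero
variance.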

Fig. 3 (a) and Fig. 3 (b) show the reconstruction of the target by the network
model trained under the narrow initial state distribution and the wide one, respectively.

Fig. 3  Output sequences with X: predicted mean and V: estimated variance in (a)
the Narrow IS case and (b) the Wide IS case.


In the narrow distribution case in Fig. 3(a), we can see a periodic sequence pattern of
3 steps repeating 0.0, 1.0, 0.5 in the output of the mean and, in synchrony, 0.0, 0.0,
0.25 in the output of the variance estimation. This corresponds to the deterministic
state transitions from state 1 to state 2 and then to state 3, and the probabilistic one
returning to state 1, in the target S-FSM. We can see that the underlying probabilistic
structure of the target S-FSM is well reconstructed in the trained model in terms of
stochastic dynamics. On the other hand, in the wide distribution case, we can see
repetitions of a 3-step sequence composed of 0.0, 1.0, “?”, in which “?” comes close to
either 1.0 or 0.0 seemingly at random in the output of the mean, whereas the output of
the variance estimation becomes almost zero. This implies that the sequences are
reconstructed in terms of a deterministic dynamic system. Indeed, the development of
deterministic chaos was confirmed by observing a positive value for the largest
Lyapunov exponent in the analysis of the obtained dynamic trajectories.
These simulation results show that the extended S-CTRNN can learn to imitate the
output sequences of the S-FSM by reconstructing them either as stochastic dynamics or as
deterministic chaos, depending on the learning condition imposed on the initial
sensitivity characteristic of the network dynamics.
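The sign test on the largest Lyapunov exponent can be illustrated on a simpler system.
The text does not state how the exponent was computed for the trained network, so the
sketch below uses the one-dimensional logistic map purely as a stand-in. The function name
and all parameter values are assumptions made for illustration.

```python
import numpy as np

def largest_lyapunov_logistic(r, x0=0.2, n_steps=20000, n_discard=1000):
    """Largest Lyapunov exponent of the logistic map x -> r * x * (1 - x).

    For a 1-D map this is the orbit average of log|f'(x)|, f'(x) = r * (1 - 2x).
    """
    x = x0
    for _ in range(n_discard):               # drop the initial transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        total += np.log(abs(r * (1.0 - 2.0 * x)) + 1e-300)  # guard against log(0)
        x = r * x * (1.0 - x)
    return total / n_steps

lam_chaotic = largest_lyapunov_logistic(3.9)    # chaotic regime of the map
lam_periodic = largest_lyapunov_logistic(3.2)   # period-2 regime of the map
```

A positive exponent means that nearby trajectories diverge exponentially on average
(chaos), while a negative exponent means that they converge to the same periodic orbit.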


Learning  to  imitate  movement  patterns  generated  with  probabilistic  decision 
sequences. 

In real world situations, perceptual sequences can be continuous in time and
can also be hierarchically organized. An interesting question is how the
network model can learn to extract latent probabilistic decision structures at the higher
cognitive level of other agents from its lower level continuous perceptual experience.
To investigate this problem, we performed robot experiments using the
humanoid robot “NAO”. In the robot task, the robot learns to imitate tutor-guided
behaviors that include probabilistic switching between different action primitives. For
the current robotics experiment, the S-CTRNN model was further extended so that it can
deal with the multiple timescale property, as shown in our prior study on the multiple
timescale recurrent neural network (MTRNN) model (Yamashita & Tani, 2008). Our
speculation is that the newly proposed S-MTRNN model can learn to predict hierarchically
organized fluctuating patterns by utilizing the multiple timescale property. The proposed
S-MTRNN (Fig. 4) contains two clusters: the fast context units ($c^{f}$) with a smaller
time constant $\tau$ and the slow context units ($c^{s}$) with a larger $\tau$. It
receives the current proprioceptive state (the encoder readings of joint angles with
4 DOF) and generates the one for the next time step in a continuous manner. The numbers
of fast and slow context units employed were 30 and 10, respectively. The time constant
$\tau$ was set to 5 for the fast context units and 30 for the slow ones.
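The role of the time constant can be sketched with a single update step of the context
units. The sketch assumes the standard leaky-integrator form
$u_{t+1} = (1 - 1/\tau)\,u_t + (1/\tau)\,(W_{rec}\tanh(u_t) + W_{in}x_t)$. The weight names
`W_rec` and `W_in`, the random weights, the tanh activation and the full connectivity are
illustrative assumptions. Only the unit counts and time constants come from the text.

```python
import numpy as np

def context_step(u, x, W_rec, W_in, tau):
    """One leaky-integrator update of the context units.

    u: internal states (n_context,), x: current input (n_input,),
    tau: per-unit time constants (n_context,). With tau == 1 the old state
    is fully replaced, i.e. an ordinary discrete-time RNN step.
    """
    c = np.tanh(u)                               # context activations
    drive = W_rec @ c + W_in @ x                 # synaptic input to each unit
    return (1.0 - 1.0 / tau) * u + (1.0 / tau) * drive

rng = np.random.default_rng(0)                   # random weights, illustration only
n_fast, n_slow, n_in = 30, 10, 4                 # sizes taken from the text
n_ctx = n_fast + n_slow
tau = np.concatenate([np.full(n_fast, 5.0), np.full(n_slow, 30.0)])
W_rec = rng.normal(0.0, 0.1, (n_ctx, n_ctx))
W_in = rng.normal(0.0, 0.1, (n_ctx, n_in))

u = np.zeros(n_ctx)
x = np.ones(n_in)
u_next = context_step(u, x, W_rec, W_in, tau)
```

For the same synaptic drive, a unit with $\tau = 30$ moves one sixth as far per step as a
unit with $\tau = 5$, which is what lets the slow cluster hold information across several
fast movements.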


Fig. 4  S-MTRNN model (figure labels: current joint angle, joint angle prediction, joint
angle target).


During the tutoring session for the NAO robot, the experimenter tutored two types
of arm movement actions, one moving the arm to the left-hand side and then
returning to the center position and the other moving the arm to the right-hand
side and then returning to the center position, by repeating them in random order
in sequences (Fig. 5).


Fig. 5  Tutoring of the NAO robot by direct guidance. Two different arm movements are
guided repeatedly in arbitrary sequential combinations, one from center (red block
position) to left (green block) and the other from center to right (blue block).

Each tutoring trial consists of 5 successive switchings, and the trials together cover
all possible combinations, i.e. $2^{5}$ sequences. The S-MTRNN was trained twice on the
tutoring patterns, with $\sigma_{IS}^{2}$ set to a small value and to a large value in
order to generate a narrow initial state distribution (Narrow-IS) and a wide
initial state distribution (Wide-IS), respectively. After the training of the S-MTRNN was
completed, the robot movement was generated for both training cases by means of
so-called closed-loop operation. In closed-loop operation with the S-MTRNN,
Gaussian noise with the estimated variance at each step is added to the feedback from the
prediction outputs of the previous step to the input of the current step.
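The closed-loop operation described above can be sketched as follows. The `predict`
interface, which returns a mean, a variance and the next internal state, is a hypothetical
stand-in for the trained network, and `toy_predict` is a placeholder used only to make the
sketch runnable.

```python
import numpy as np

def closed_loop(predict, x0, state0, n_steps, rng):
    """Generate a sequence by feeding noisy predictions back as inputs.

    predict(x, state) -> (mean, variance, next_state) is a hypothetical
    interface standing in for the trained network.
    """
    x, state = np.asarray(x0, dtype=float), state0
    outputs = []
    for _ in range(n_steps):
        mean, var, state = predict(x, state)
        noise = rng.normal(0.0, np.sqrt(var))    # std = sqrt(estimated variance)
        x = mean + noise                         # next input = prediction + noise
        outputs.append(x)
    return np.array(outputs)

def toy_predict(x, state):
    """Toy stand-in: decays toward zero and reports a fixed small variance."""
    return 0.9 * x, np.full_like(x, 1e-4), state

rng = np.random.default_rng(0)                   # seed chosen arbitrarily
trajectory = closed_loop(toy_predict, x0=np.ones(4), state0=None, n_steps=50, rng=rng)
```

When the estimated variance is zero at every step, the loop is fully deterministic and the
generated sequence depends only on the initial input and the initial internal state.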

Fig. 6 (a) and (b) illustrate examples of behavior generation in terms of the
proprioception sequence associated with the estimated variance and the internal neural
activities in the fast and the slow context units, generated by the network trained under
the Narrow-IS and Wide-IS conditions, respectively.

Fig. 6  Robot behavior generation (a) in the Narrow-IS case and (b) in the Wide-IS
case.


In both cases, it can be observed that the two behaviors of moving to the right and
moving to the left alternate arbitrarily. In the Narrow-IS case, the estimated variances
showed sharp peaks at the decision points but were almost zero at other time steps. This
implies that the trained S-MTRNN developed the action primitives of moving to the left
and right as two distinct chunks, together with a probabilistic switching mechanism at
the decision points that utilizes the large variance estimated at those time points.
Therefore, it can be said that a probabilistic decision mechanism was developed under the
Narrow-IS training condition. On the other hand, in the Wide-IS case, the variance is
estimated as almost zero for all steps including the decision points. This implies that
motor behavior is generated by deterministic dynamics with initial sensitivity in the
Wide-IS condition. Although we expected the development of chaos again in this
experiment, the largest Lyapunov exponent turned out to be negative, contrary to our
expectation. However, we observed that seemingly random switching at the decision points
continues for more than 10 consecutive switchings before converging to a particular
periodic sequence. This implies that spontaneous switching was realized by transient
chaos. For a relatively long period, arbitrary sequential combinations of the two action
primitives can be generated depending on the initial context state.

Another interesting observation was that the internal neural activity
differed considerably between the two cases. In the Narrow-IS case, the neural
activities in both the slow and the fast subnetworks showed the same values at all
decision points. On the other hand, in the Wide-IS case, both the slow and the fast
neurons exhibited specific activation patterns at each decision point, which can predict
the forthcoming behavior of moving either to the left or to the right. This implies that
there was no bias in the neural activity at the moment of decision in the Narrow-IS case,
whereas there were top-down predictive biases represented by specific neural activation
patterns at the decision points in the Wide-IS case. Finally, we consider the
contribution of the multiple timescale property of the employed model to the imitative
learning of hierarchically organized probabilistic decision behavior. Our additional
experiments revealed that the network model without the slow context units could learn
the task under the Narrow-IS condition but not under the Wide-IS condition. The slow
dynamics part was necessary in the Wide-IS condition because the transient chaos that
enables spontaneous switching of the primitive actions stored in the fast dynamics part
developed in the slow dynamics part.

The current experimental results showed that the S-MTRNN model, which is
characterized by the multiple timescale property, can learn to reconstruct a continuous
perceptual flow whose fluctuations are triggered by sequences of probabilistic decisions.
It was observed that stochastic dynamics developed when the initial sensitivity
characteristics were utilized less, while deterministic dynamics with transient chaos
developed when they were utilized more in the learning process. This result is analogous
to the one shown in our preceding simulation experiment on the imitative learning of
S-FSM output sequences.

Two-leveled  S-MTRNN  to  extract  probabilistic  structures  latent  in  different 
timescales  dynamics 

We observed several limitations of the S-MTRNN in the second experiment. First, the
slow context unit activities did not develop well in the Narrow-IS condition.
Second, although a variance peak was generated at each switching point, the predicted
temporal sequence changed sharply at each switching point. One hypothesis is that the
variance output was connected only with the fast context units, i.e. the fast timescale
network, so that variance generation depended only on the fast timescale dynamics. To
overcome these limitations, we propose an extended S-MTRNN called the two-leveled
S-MTRNN.

The two-leveled S-MTRNN (Fig. 7) consists of two sub-networks, each
characterized by a different timescale. The fast timescale network contains fast context
units with a small time constant. It receives the current input $i^{f}_{t}$ and generates
the one for the next time step, as in the S-MTRNN. The fast timescale network is
connected with newly proposed units called pseudo target units. The slow timescale
network contains slow context units with a large time constant and generates the pseudo
target units in the closed-loop phase. In the training phase, it receives the current
input $i^{s}_{t}$, which comes from the pseudo target units at the previous time step.


Unlike the previous models, training is separated into two stages, as
in a mixture of RNN experts (Namikawa, 2008). In the first training phase, the fast
timescale network is trained and the values of the pseudo target units are self-organized
while the slow timescale network is fixed, by maximizing the likelihood function
$L_{fast}$ defined as follows:

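$$L_{fast} = \prod_{r} \prod_{t} \prod_{i} \frac{1}{\sqrt{2\pi v^{f}_{r,t,i}}} \exp\left(-\frac{(o^{f}_{r,t,i} - \hat{o}^{f}_{r,t,i})^{2}}{2 v^{f}_{r,t,i}}\right)$$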

where $o^{f}_{r,t,i}$ is the $i$th dimension of the output of the fast timescale network
at time step $t$ in the $r$th sequence, $\hat{o}^{f}_{r,t,i}$ is its teaching target, and
$v^{f}_{r,t,i}$ is the variance of the fast timescale network. The pseudo target plays a
key role in this model by competing with the fast output and the fast variance. In more
detail, the error arising from probabilistic structure latent in the slower timescale
dynamics, which cannot be properly minimized by the fast output and fast variance, can be
minimized by self-organizing the pseudo target.

In the second training phase, the slow timescale network is trained by maximizing the
likelihood function $L_{slow}$, based on the trained fast timescale network.


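$$L_{slow} = \prod_{r} \prod_{t} \prod_{i} \frac{1}{\sqrt{2\pi v^{s}_{r,t,i}}} \exp\left(-\frac{(o^{s}_{r,t,i} - \hat{o}^{s}_{r,t,i})^{2}}{2 v^{s}_{r,t,i}}\right) \qquad \text{(Eq. 3)}$$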


where $o^{s}_{r,t,i}$ is the $i$th dimension of the output of the slow timescale network
at time step $t$ in the $r$th sequence, $\hat{o}^{s}_{r,t,i}$ is the pseudo target
self-organized in the first training phase, and $v^{s}_{r,t,i}$ is the variance of the
slow timescale network. During the second training phase, the slow timescale network
tries to generate the pseudo target by means of the slow output and slow variance. In
this way, the probabilistic structure captured by the pseudo target can be
redistributed into the slow output and slow variance.

To verify the capability of the two-leveled S-MTRNN, we conducted a simple
decision making experiment using two different computer-generated temporal
sequences (Fig. 8). The two temporal sequences are exactly the same until time step 100.
After that, one goes down and the other goes up. The purpose of the experiment is to see
whether the probabilistic structure in the slow timescale dynamics can be captured by the
two-leveled S-MTRNN. For this experiment, the fast timescale network had 30 fast context
units with the time constant $\tau$ set to 5 and 1 pseudo target unit, while the slow
timescale network had 10 slow context units with the time constant $\tau$ set to 100. The
two-leveled S-MTRNN was trained under two different training conditions: (1) updating the
initial context state (deterministic case) and (2) fixing the initial context state
(stochastic case).

Fig. 8  Computer-generated 2-dimensional branching sequences

Fig. 9  The internal dynamics of the two-leveled S-MTRNN in (a) the deterministic
case and (b) the stochastic case.

In Fig. 9, the self-organized structures of the pseudo target were similar and the fast
variance was almost zero regardless of the training condition. This means that slow
probabilistic structure within temporal sequences, such as branching, can be extracted
more easily by the pseudo target than by the fast output and fast variance. The
significant difference between the training conditions appears in the slow timescale
network. In the deterministic case, the initial states of the slow context for the two
target sequences were completely separated and the slow variance was almost zero. These
results show that the network already knew the direction of branching from the starting
point in terms of the initial state of the slow context. This result is analogous to the
robotics experiment results shown in (Tani, 2014). On the other hand, in the stochastic
case, the network cannot predict the direction of branching because the initial state of
the context is exactly the same for both sequences. However, the network can predict the
branching point by generating a slow variance peak at the branching point. In brief, the
pseudo target extracts the slow probabilistic structure in the first training phase, and
this structure is then re-extracted by the slow timescale network as stochastic dynamics
or deterministic dynamics, depending on the training condition, in the second training
phase. At the same time, in the stochastic case, the predicted temporal sequence showed a
smooth transition in closed-loop generation, because the noise generated from the slow
variance is added to the slow context dynamics and not directly to the fast context
dynamics. The simulation results of this experiment indicate that the two-leveled
S-MTRNN successfully extracts slow probabilistic structure in slow timescale dynamics,
thanks especially to the pseudo target.

3.  Summary 

The current study investigated a novel neuro-dynamic scheme by which fluctuating
sequence patterns generated by particular target sources can be reconstructed by
extracting latent statistical structures in the target patterns via iterative learning.
The uniqueness of the proposed scheme is that the fluctuating sequences are learned by
adequately incorporating stochastic dynamics and deterministic chaos self-organized in
the network, depending on the initial sensitivity condition set in the learning process.
If the initial sensitivity is not utilized, as in the Narrow-IS condition, a non-zero
value of the time-varying variance is estimated along with the prediction of the mean of
the target at each step. The target sequence is regenerated in terms of stochastic
dynamics because the sequences are generated by adding noise with the estimated variance
to the predicted mean at each step. On the other hand, if the initial sensitivity is
utilized, as in the Wide-IS condition, deterministic dynamics of chaos or transient
chaos develop, with the time-varying variance estimated as almost zero for all steps.

The aforementioned principle was evaluated by conducting a set of
experiments associated with a series of extensions of the basic model. The first
simulation experiment showed how the basic S-CTRNN model can learn to
reconstruct a target S-FSM by extracting its probabilistic structure. It was shown that
the state transition with probabilistic branching in the target S-FSM could be imitated
either by estimating adequate variance at the branching step, as a result of learning
under the Narrow-IS condition, or by developing deterministic chaos under the Wide-IS
condition.

The second experiment, which used a robot, showed how the extended
S-MTRNN model can learn to reconstruct continuous perceptual sequences that
are generated by particular probabilistic decision sequences. It was shown that the
S-MTRNN learned two different action primitives by utilizing the fast context unit
activities in both the Narrow-IS and the Wide-IS conditions. However, the
mechanism of probabilistic switching between these primitive actions
developed differently in the two conditions. In the Narrow-IS condition, it was
observed that the time-varying variance was estimated with a peak value at each
moment of the switching decision. It seemed that the activity in the slow context units
contributed less to the performance of the whole system. On the other hand, in the
Wide-IS condition, it was found that the probabilistic switching between the two
primitive actions was realized by the transient chaos developed in the slow context
activity. This result raised one question: why can the slow context unit
activity not be utilized in the Narrow-IS condition? If the switching decision
originates from fluctuation at the higher cognitive level, such fluctuation should
also appear in the slow dynamics part at the higher level of the model.

The third simulation experiment was conducted in order to investigate this
problem. The two-leveled S-MTRNN was trained to regenerate two branching sequences
under two different conditions: the stochastic case and the deterministic case. The
training was divided into two phases. In the first training phase, the fast timescale
network was trained while self-organizing the pseudo target. There was no significant
difference between the two training conditions in this phase. In the second training
phase, the slow timescale network was trained using the trained fast timescale network
and the pseudo target. The two training conditions were differentiated in the second
training phase. The probabilistic structure extracted by the pseudo target can be
re-extracted by the initial context state or by the slow variance, depending on the
training condition. As we expected, the two-leveled S-MTRNN utilized the slow context
dynamics and generated a smooth transition of the predicted temporal sequence, even in
the stochastic case. However, the two-leveled S-MTRNN is still at a preliminary stage
and requires further study. In future work, we will provide an updated version of the
two-leveled S-MTRNN that can train the fast timescale network and the slow timescale
network at the same time.